Predictive Discretization During Model Selection

Authors

  • Harald Steck
  • Tommi S. Jaakkola
Abstract

We present an approach to discretizing multivariate continuous data while learning the structure of a graphical model. We derive a joint scoring function from the principle of predictive accuracy, which inherently ensures the optimal trade-off between goodness of fit and model complexity, including the number of discretization levels. Using the so-called finest grid implied by the data, our scoring function depends only on the number of data points in the various discretization levels (independent of the metric used in the continuous space). Our experiments with artificial data as well as with gene expression data show that discretization plays a crucial role in determining the resulting network structure.
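The claim that the score depends only on per-level counts, and not on the metric of the continuous space, can be illustrated with a small sketch. This is not the authors' scoring function. It assumes, for illustration only, that the finest grid consists of the midpoints between consecutive distinct sorted values; the helper names `finest_grid` and `level_counts` are hypothetical. The sketch shows that the counts per discretization level are unchanged when the data are passed through a monotone transformation.

```python
from bisect import bisect_right
from math import exp


def finest_grid(values):
    """Candidate cut points: midpoints between consecutive distinct sorted values.

    Assumption for illustration: this is one plausible reading of the
    "finest grid implied by the data", not the paper's exact definition.
    """
    distinct = sorted(set(values))
    return [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]


def level_counts(values, cuts):
    """Count how many data points fall into each level defined by sorted cuts."""
    counts = [0] * (len(cuts) + 1)
    for v in values:
        counts[bisect_right(cuts, v)] += 1
    return counts


data = [0.1, 0.4, 0.5, 2.0, 3.5, 3.6]
grid = finest_grid(data)

# Choose two cut points by their position in the finest grid.
chosen_idx = [2, 3]
counts = level_counts(data, [grid[i] for i in chosen_idx])

# Apply a strictly increasing transformation (changes distances, keeps order).
warped = [exp(v) for v in data]
warped_grid = finest_grid(warped)
warped_counts = level_counts(warped, [warped_grid[i] for i in chosen_idx])

print(counts, warped_counts)
```

Because the cut points are identified by their position in the grid rather than by their numeric value, any score computed from `counts` alone would be the same for `data` and `warped`.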

Similar resources

(Semi-)Predictive Discretization During Model Selection

Data discretization is needed for various reasons. One reason is that there are many machine learning algorithms that can only be applied to discrete data. In order to use those algorithms, we need to discretize the data. We might also want to do that for solely computational reasons; some problems are easier to compute for discrete variables. Finally, if we know that our data is discrete, but ...

A New Hybrid Framework for Filter based Feature Selection using Information Gain and Symmetric Uncertainty (TECHNICAL NOTE)

Feature selection is a pre-processing technique used for eliminating the irrelevant and redundant features which results in enhancing the performance of the classifiers. When a dataset contains more irrelevant and redundant features, it fails to increase the accuracy and also reduces the performance of the classifiers. To avoid them, this paper presents a new hybrid feature selection method usi...

The Role of Discretization Parameters in Sequence Rule Evolution

As raw data become available in ever-increasing amounts, there is a need for automated methods that extract comprehensible knowledge from the data. In our previous work we have applied evolutionary algorithms to the problem of mining predictive rules from time series. In this paper we investigate the effect of discretization on the predictive power of the evolved rules. We compare the effects o...

Design and implementation of a model predictive controller for the COVID-19 spread restraint in Iran

In this paper, a model is proposed based on the different levels of social restrictions for the COVID-19 spread restraint in Iran. Also, a Genetic Algorithm (GA) identifies the parameters of the model using the main data reported by the Iranian Ministry of Health and simulated data based on the proposed model. Whereas Model Predictive Control (MPC) is a popular method which has been widely used in process ...

Dichotomization of ICU Length of Stay Based on Model Calibration

This paper presents a method to choose the threshold for dichotomization of survival outcomes in a structured fashion based on data analysis. The method is illustrated with an application to the prediction problem of the outcome length of stay at Intensive Care Unit (ICU LOS). Threshold selection is based on comparing the calibration of predictive models for dichotomized outcomes with increasin...

Publication date: 2004